Smart Paper Notes

Tag-Aware Document Representation for Research Paper Recommendation

Hebatallah A. Mohamed , Giuseppe Sansonetti , Alessandro Micarelli

Category: Machine Learning

2022-09-08

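Finding online research papers relevant to one's interests is very challenging because the number of publications keeps growing. Personalized research paper recommendation has therefore become an important and timely research topic. Collaborative filtering is a successful recommendation approach that uses the ratings users give to items as its source of information for learning to make accurate recommendations. However, because the number of publications grows substantially every year, ratings are usually very sparse. As a result, more attention has been drawn to hybrid methods that consider both ratings and content information. Nevertheless, most hybrid recommendation approaches based on text embedding use bag-of-words techniques, which ignore word order and semantic meaning. In this paper, we propose a hybrid approach that leverages a deep semantic representation of research papers based on the social tags assigned by users. The experimental evaluation is performed on CiteULike, a real and publicly available dataset. The findings show that the proposed model recommends research papers effectively even when the rating data is very sparse.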

Translated by Google Translate

Geolocation of Cultural Heritage using Multi-View Knowledge Graph Embedding

Hebatallah A. Mohamed , Sebastiano Vascon , Feliks Hibraj , Stuart James , Diego Pilutti , Alessio Del Bue , Marcello Pelillo

Category: Machine Learning | Natural Language Processing

2022-09-08

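Knowledge graphs (KGs) have proven to be a reliable way of structuring data. They can provide rich contextual information about cultural heritage collections. However, cultural heritage KGs are far from complete. They often lack important attributes such as geographical location, especially for sculptures and for mobile or indoor entities such as paintings. In this paper, we first present a framework for ingesting knowledge about tangible cultural heritage entities from various data sources, together with their connected multi-hop knowledge. Second, we propose a multi-view learning model for estimating the relative distance between given cultural heritage entities, based on both the geographical and the knowledge connections of the entities.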


A Comprehensive Review on Autonomous Navigation

Saeid Nahavandi , Roohallah Alizadehsani , Darius Nahavandi , Shady Mohamed , Navid Mohajer , Mohammad Rokonuzzaman , Ibrahim Hossain

Category: Robotics

2022-12-24

The field of autonomous mobile robots has advanced dramatically over the past decades. Despite important milestones, several challenges remain to be addressed. Aggregating the achievements of the robotics community in survey papers is vital for keeping track of the current state of the art and of the challenges that must be tackled in the future. This paper provides a comprehensive review of autonomous mobile robots, covering topics such as sensor types, mobile robot platforms, simulation tools, path planning and following, sensor fusion methods, obstacle avoidance, and SLAM. The motivation for this survey is twofold. First, the field of autonomous navigation evolves quickly, so regular surveys are crucial to keep the research community aware of its current status. Second, deep learning methods have revolutionized many fields, including autonomous navigation, so the paper also gives appropriate treatment to the role of deep learning in autonomous navigation. Future work and research gaps are also discussed.


Hardware Acceleration of Lane Detection Algorithm: A GPU Versus FPGA Comparison

Mohamed Alshemi , Sherif Saif , Mohamed Taher

Category: Computer Vision

2022-12-19

A complete computer vision system can be divided into two main categories: detection and classification. Lane detection belongs to the detection category and has been applied in autonomous driving and smart vehicle systems. The lane detection system is responsible for identifying lane markings in a complex road environment. It also plays a crucial role in the warning system that is triggered when a car departs from its lane. The implemented lane detection algorithm is divided into two main steps: edge detection and line detection. In this paper, we compare the state-of-the-art implementation performance obtained with an FPGA and a GPU to evaluate the trade-offs in latency, power consumption, and utilization. Our comparison highlights the advantages and disadvantages of the two systems.


Multimodal CNN Networks for Brain Tumor Segmentation in MRI: A BraTS 2022 Challenge Solution

Ramy A. Zeineldin , Mohamed E. Karar , Oliver Burgert , Franziska Mathis-Ullrich

Category: Computer Vision | Machine Learning

2022-12-19

Automatic segmentation is essential for brain tumor diagnosis, disease prognosis, and follow-up therapy of patients with gliomas. Still, accurate detection of gliomas and their sub-regions in multimodal MRI is very challenging due to the variety of scanners and imaging protocols. In recent years, the BraTS Challenge has provided a large number of multi-institutional MRI scans as a benchmark for glioma segmentation algorithms. This paper describes our contribution to the BraTS 2022 Continuous Evaluation challenge. We propose a new ensemble of multiple deep learning frameworks, namely DeepSeg, nnU-Net, and DeepSCAN, for automatic glioma boundary detection in pre-operative MRI. Our ensemble models took first place in the final evaluation on the BraTS testing dataset, with Dice scores of 0.9294, 0.8788, and 0.8803 and Hausdorff distances of 5.23, 13.54, and 12.05 for the whole tumor, tumor core, and enhancing tumor, respectively. Furthermore, the proposed ensemble method ranked first in the final ranking on another unseen test dataset, the Sub-Saharan Africa dataset, achieving mean Dice scores of 0.9737, 0.9593, and 0.9022 and HD95 of 2.66, 1.72, and 3.32 for the whole tumor, tumor core, and enhancing tumor, respectively. The Docker image for the winning submission is publicly available at https://hub.docker.com/r/razeineldin/camed22.
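
For readers unfamiliar with the metric, the Dice score quoted above measures the overlap between a predicted mask A and a reference mask B as 2|A∩B| / (|A| + |B|). The following is a minimal sketch in plain Python, assuming binary masks flattened into lists. The function name `dice_score` and the convention of returning 1.0 for two empty masks are illustrative assumptions, not taken from the authors' code or the BraTS evaluation tooling.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], ref: Sequence[int]) -> float:
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two binary masks."""
    if len(pred) != len(ref):
        raise ValueError("masks must have the same number of voxels")
    # Count voxels labelled foreground in both masks.
    intersection = sum(1 for p, r in zip(pred, ref) if p and r)
    total = sum(1 for p in pred if p) + sum(1 for r in ref if r)
    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy example: 4 predicted voxels, 4 reference voxels, 3 shared.
pred = [1, 1, 1, 1, 0, 0]
ref = [0, 1, 1, 1, 1, 0]
print(dice_score(pred, ref))  # 2 * 3 / (4 + 4) = 0.75
```

A score of 1.0 means the two masks coincide and 0.0 means they do not overlap at all.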


Biomedical image analysis competitions: The state of current participation practice

Matthias Eisenmann , Annika Reinke , Vivienn Weru , Minu Dietlinde Tizabi , Fabian Isensee , Tim J. Adler , Patrick Godau , Veronika Cheplygina , Michal Kozubek , Sharib Ali

Category: Computer Vision | Machine Learning

2022-12-16

The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice of the community in tackling the research questions posed, or about the bottlenecks it faces. To shed light on the status quo of algorithm development in the specific field of biomedical image analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, and algorithm characteristics. A median of 72% of the challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants (32%) stated that they did not have enough time for method development. 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.


Image augmentation with conformal mappings for a convolutional neural network

Oona Rainio , Mohamed M. S. Nasser , Matti Vuorinen , Riku Klén

Category: Computer Vision

2022-12-10

For augmenting the square-shaped image data of a convolutional neural network (CNN), we introduce a new method in which the original images are mapped onto a disk with a conformal mapping, rotated around the center of this disk, mapped under a Möbius transformation that preserves the disk, and then mapped back onto their original square shape. Unlike the typical transformations used in data augmentation for a CNN, this process does not cause the loss of information that results from removing areas near the edges of the original images. We give the formulas of all the mappings needed, together with detailed instructions on how to write code for transforming the images. The new method is also tested with simulated data and, according to the results, using this method to augment the training data from 10 images to 40 images decreases the error in the predictions by a CNN for a test set of 160 images in a statistically significant way (p-value = 0.0360).
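
The rotation and the disk-preserving Möbius step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the function name `disk_automorphism` and the parameters `a` and `theta` are hypothetical, and the conformal maps between the square and the disk are omitted. It relies on the standard fact that, for |a| < 1, the map w -> (w - a) / (1 - conj(a) w) sends the unit disk onto itself.

```python
import cmath


def disk_automorphism(z: complex, a: complex, theta: float) -> complex:
    """Rotate z by theta about the origin, then apply a disk-preserving Möbius map.

    For |a| < 1, w -> (w - a) / (1 - conj(a) * w) maps the unit disk onto
    itself and sends the point a to the center.
    """
    if abs(a) >= 1:
        raise ValueError("a must lie strictly inside the unit disk")
    w = cmath.exp(1j * theta) * z  # rotation about the disk center
    return (w - a) / (1 - a.conjugate() * w)


a = 0.3 + 0.2j  # hypothetical parameter choice
on_boundary = disk_automorphism(1j, a, theta=0.7)
inside = disk_automorphism(0.5 + 0.1j, a, theta=0.7)
print(abs(on_boundary))  # boundary points stay on the unit circle, up to rounding
print(abs(inside) < 1)   # interior points stay inside the disk
```

Because every point of the disk stays in the disk, no image region is pushed outside the frame, which is the property the abstract contrasts with crop-based augmentation.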


Towards Next Generation of Pedestrian and Connected Vehicle In-the-loop Research: A Digital Twin Simulation Framework

Zijin Wang , Ou Zheng , Liangding Li , Mohamed Abdel-Aty , Carolina Cruz-Neira , Zubayer Islam

Category: Robotics

2022-12-08

Digital Twin is an emerging technology that replicates real-world entities in a digital space. It has attracted increasing attention in the transportation field, and many researchers are exploring its future applications in the development of Intelligent Transportation System (ITS) technologies. Connected vehicles (CVs) and pedestrians are among the major traffic participants in ITS. However, the use of Digital Twin in research involving both CVs and pedestrians remains largely unexplored. In this study, a Digital Twin framework for CV and pedestrian in-the-loop simulation is proposed. The proposed framework consists of the physical world, the digital world, and the data transmission between them. The features of the entities (CV and pedestrian) that need to be digitally twinned are divided into external state and internal state, and the attributes in each state are described. We also demonstrate a sample architecture under the proposed Digital Twin framework, which is based on Carla-Sumo Co-simulation and Cave automatic virtual environment (CAVE). The proposed framework is expected to provide guidance for future Digital Twin research, and the architecture we build can serve as a testbed for further research and development of ITS applications involving CVs and pedestrians.


A domain adaptive deep learning solution for scanpath prediction of paintings

Mohamed Amine Kerkouri , Marouane Tliba , Aladine Chetouani , Alessandro Bruno

Category: Computer Vision

2022-09-22

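Understanding and preserving cultural heritage is an important issue for society, because cultural heritage represents a fundamental aspect of its identity. Paintings are a significant part of cultural heritage and a subject of continuous study. However, the way viewers perceive paintings is strictly related to the behaviour of the so-called HVS (Human Vision System). This paper focuses on the eye-movement analysis of viewers during the visual experience of a certain number of paintings. In more detail, we introduce a new approach to predicting human visual attention, which affects several human cognitive functions, including the fundamental understanding of a scene, and then extend it to painting images. The proposed architecture ingests an image and returns a scanpath, a sequence of points with a high likelihood of catching the viewer's attention. We use an FCNN (Fully Convolutional Neural Network) in which we exploit differentiable channel-wise selection and Soft-Argmax modules. We also incorporate learnable Gaussian distributions at the network bottleneck to simulate the bias of the visual attention process in natural scene images. Furthermore, to reduce the effect of shifts between different domains (i.e., natural images and paintings), we urge the model to learn unsupervised general features from other domains using a gradient reversal classifier. The results obtained by our model outperform existing state-of-the-art results in terms of accuracy and efficiency.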
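
Soft-argmax is a differentiable stand-in for argmax: instead of picking the index of the largest score, it returns the softmax-weighted average of all indices, so gradients can flow through a predicted coordinate. The following is a minimal one-dimensional sketch in plain Python. The function name `soft_argmax` and the sharpness parameter `beta` are illustrative assumptions, and nothing here is taken from the authors' network.

```python
import math
from typing import Sequence


def soft_argmax(scores: Sequence[float], beta: float = 10.0) -> float:
    """Softmax-weighted average of indices; approaches argmax as beta grows."""
    peak = max(scores)
    # Subtract the peak before exponentiating for numerical stability.
    weights = [math.exp(beta * (s - peak)) for s in scores]
    total = sum(weights)
    return sum(i * w for i, w in enumerate(weights)) / total


# A sharp peak at index 2 yields a value very close to 2.
print(soft_argmax([0.0, 0.0, 5.0, 0.0]))
# Two equal peaks at indices 0 and 2 yield their midpoint.
print(soft_argmax([1.0, 0.0, 1.0]))
```

In a two-dimensional heatmap the same weighted average would be taken separately over the x and y coordinates.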


A Few Shot Multi-Representation Approach for N-gram Spotting in Historical Manuscripts

Giuseppe De Gregorio , Sanket Biswas , Mohamed Ali Souibgui , Asma Bensalah , Josep Lladós , Alicia Fornés , Angelo Marcelli

Category: Computer Vision

2022-09-21

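Despite recent advances in automatic text recognition, performance remains moderate on historical manuscripts. This is mainly due to the scarcity of labelled data available for training data-hungry Handwritten Text Recognition (HTR) models. Keyword Spotting Systems (KWS) provide a valid alternative to HTR because of their lower error rate, but they are usually limited to a closed reference vocabulary. In this paper, we propose a few-shot learning paradigm for spotting sequences of a few characters (n-grams) that requires only a small amount of labelled training data. We show that recognizing important n-grams can reduce the system's dependency on the vocabulary. In this setting, an out-of-vocabulary (OOV) word in an input handwritten line image can be treated as a sequence of n-grams that belong to the lexicon. An extensive experimental evaluation of our proposed multi-representation approach was carried out.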
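
The idea that an out-of-vocabulary word can be read as a sequence of known n-grams can be illustrated at the text level. The sketch below is a toy under stated assumptions: it works on character strings rather than handwritten line images, and the function name `segment_into_ngrams` and the example lexicon are hypothetical, not part of the authors' system.

```python
from typing import List, Optional, Set


def segment_into_ngrams(word: str, lexicon: Set[str]) -> Optional[List[str]]:
    """Return one split of `word` into lexicon n-grams, or None if none exists."""
    # best[i] holds a segmentation of word[:i], or None if none was found.
    best: List[Optional[List[str]]] = [None] * (len(word) + 1)
    best[0] = []
    for end in range(1, len(word) + 1):
        for start in range(end):
            piece = word[start:end]
            prefix = best[start]
            if prefix is not None and piece in lexicon:
                best[end] = prefix + [piece]
                break  # keep the first segmentation found for this prefix
    return best[len(word)]


lexicon = {"th", "in", "ing", "k", "er"}  # hypothetical n-gram lexicon
print(segment_into_ngrams("thinker", lexicon))  # ['th', 'in', 'k', 'er']
print(segment_into_ngrams("xyz", lexicon))      # None
```

Here "thinker" need not appear in any word list: it is covered as long as each of its pieces is a known n-gram.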
